Segmenting a document by stylistic character

Authors

  • Neil Graham
  • Graeme Hirst
Abstract

As part of a larger project to develop an aid for writers that would help to eliminate stylistic inconsistencies within a document, we experimented with neural networks to find the points in a text at which its stylistic character changes. Our best results, well above baseline, were achieved with time-delay networks that used features related to the author’s syntactic preferences. Low-level and vocabulary-based features were not found to be useful.

Similar resources

Segmenting documents by stylistic character

As part of a larger project to develop an aid for writers that would help to eliminate stylistic inconsistencies within a document, we experimented with neural networks to find the points in a text at which its stylistic character changes. Our best results, well above baseline, were achieved with time-delay networks that used features related to the author’s syntactic preferences, whereas low-l...

Style-Directed Document Recognition

We are developing a document recognition system that can be tunably optimized for performance on documents of specific styles. We interactively generate XML to encode specific knowledge about a class of documents to be input to a recognition system. The encoding includes attributes of document logical structure as well as layout structure constraints. The encoding of document style is used to a...

Methodology of Stylistic Study in Persian Carpet Design and Motifs

Iranian carpets, depending on the climate and culture of the societies that produce them, have taken on various visual and technical characteristics. The scientific and methodological study of these characteristics and their formative principles is a subject of ongoing debate among researchers. The main issue of this research is to consider the diversity and individuality of research and analyze methods in carpet studi...

Intrinsic Plagiarism Detection Using Character n-gram Profiles

The task of intrinsic plagiarism detection deals with cases where no reference corpus is available and it is exclusively based on stylistic changes or inconsistencies within a given document. In this paper a new method is presented that attempts to quantify the style variation within a document using character n-gram profiles and a style change function based on an appropriate dissimilarity mea...

Devnagari document segmentation using histogram approach

Document segmentation is one of the critical phases in machine recognition of any language. Correct segmentation of individual symbols determines the accuracy of the character recognition technique. It is used to decompose an image of a sequence of characters into sub-images of individual symbols by segmenting lines and words. Devnagari is the most popular script in India. It is used for writing Hindi, M...

Publication date: 2003